Learning Taxonomy Adaptation in Large-scale Classification

نویسندگان

Rohit Babbar

Ioannis Partalas

Éric Gaussier

Massih-Reza Amini

Cécile Amblard

چکیده

In this paper, we study flat and hierarchical classification strategies in the context of largescale taxonomies. Addressing the problem from a learning-theoretic point of view, we first propose a multi-class, hierarchical data dependent bound on the generalization error of classifiers deployed in large-scale taxonomies. This bound provides an explanation to several empirical results reported in the literature, related to the performance of flat and hierarchical classifiers. Based on this bound, we also propose a technique for modifying a given taxonomy through pruning, that leads to a lower value of the upper bound as compared to the original taxonomy. We then present another method for hierarchy pruning by studying approximation error of a family of classifiers, and derive from it features used in a meta-classifier to decide which nodes to prune. We finally illustrate the theoretical developments through several experiments conducted on two widely used taxonomies.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Deep Unsupervised Domain Adaptation for Image Classification via Low Rank Representation Learning

Domain adaptation is a powerful technique given a wide amount of labeled data from similar attributes in different domains. In real-world applications, there is a huge number of data but almost more of them are unlabeled. It is effective in image classification where it is expensive and time-consuming to obtain adequate label data. We propose a novel method named DALRRL, which consists of deep ...

متن کامل

Sample-oriented Domain Adaptation for Image Classification

Image processing is a method to perform some operations on an image, in order to get an enhanced image or to extract some useful information from it. The conventional image processing algorithms cannot perform well in scenarios where the training images (source domain) that are used to learn the model have a different distribution with test images (target domain). Also, many real world applicat...

متن کامل

Large-Scale Many-Class Prediction via Flat Techniques

Prediction problems with huge numbers of classes are becoming more common. While class taxonomies are available in certain cases, we have observed that simple flat learning and classification, via index learning and related techniques, offers significant efficiency and accuracy advantages. In the PASCAL challenge on large-scale hierarchical text classification, the accuracies we obtained ranked...

متن کامل

Taxonomy of Communication Requirements for Large-scale Multicast Applications

The intention of this memo is to define a classification system for the communication requirements of any large-scale multicast application (LSMA). It is very unlikely one protocol can achieve a compromise between the diverse requirements of all the parties involved in any LSMA. It is therefore necessary to understand the worst-case scenarios in order to minimize the range of protocols needed. ...

متن کامل

Machine Learning and Citizen Science: Opportunities and Challenges of Human-Computer Interaction

Background and Aim: In processing large data, scientists have to perform the tedious task of analyzing hefty bulk of data. Machine learning techniques are a potential solution to this problem. In citizen science, human and artificial intelligence may be unified to facilitate this effort. Considering the ambiguities in machine performance and management of user-generated data, this paper aims to...

متن کامل

ذخیره در منابع من

ذخیره در منابع من قبلا به منابع من ذحیره شده

{@ msg_add @}

با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

Journal of Machine Learning Research

دوره 17 شماره

صفحات -

تاریخ انتشار 2016

Learning Taxonomy Adaptation in Large-scale Classification

نویسندگان

چکیده

منابع مشابه

Deep Unsupervised Domain Adaptation for Image Classification via Low Rank Representation Learning

Sample-oriented Domain Adaptation for Image Classification

Large-Scale Many-Class Prediction via Flat Techniques

Taxonomy of Communication Requirements for Large-scale Multicast Applications

Machine Learning and Citizen Science: Opportunities and Challenges of Human-Computer Interaction

عنوان ژورنال:

اشتراک گذاری